Limited variation in microbial communities across populations of Macrosteles leafhoppers (Hemiptera: Cicadellidae)

Abstract Microbial symbionts play crucial roles in insect biology, yet their diversity, distribution, and temporal dynamics across host populations remain poorly understood. In this study, we investigated the spatio‐temporal distribution of bacterial symbionts within the widely distributed and economically significant leafhopper genus Macrosteles, with a focus on Macrosteles laevis. Using host and symbiont marker gene amplicon sequencing, we explored the intricate relationships between these insects and their microbial partners. Our analysis of the cytochrome oxidase subunit I (COI) gene data revealed several intriguing findings. First, there was no strong genetic differentiation across M. laevis populations, suggesting gene flow among them. Second, we observed significant levels of heteroplasmy, indicating the presence of multiple mitochondrial haplotypes within individuals. Third, parasitoid infections were prevalent, highlighting the complex ecological interactions involving leafhoppers. The 16S rRNA data confirmed the universal presence of ancient nutritional endosymbionts—Sulcia and Nasuia—in M. laevis. Additionally, we found a high prevalence of Arsenophonus, another common symbiont. Interestingly, unlike most previously studied species, M. laevis exhibited only occasional cases of infection with known facultative endosymbionts and other bacteria. Notably, there was no significant variation in symbiont prevalence across different populations or among sampling years within the same population. Comparatively, facultative endosymbionts such as Rickettsia, Wolbachia, Cardinium and Lariskella were more common in other Macrosteles species. These findings underscore the importance of considering both host and symbiont dynamics when studying microbial associations. By simultaneously characterizing host and symbiont marker gene amplicons in large insect collections, we gain valuable insights into the intricate interplay between insects and their microbial partners. Understanding these dynamics contributes to our broader comprehension of host–microbe interactions in natural ecosystems.


INTRODUCTION
Microorganisms are diverse and abundant across all kinds of habitats on Earth, but the microbiota of eukaryotic organisms have attracted particular attention due to their ecological, evolutionary, biomedical, and agricultural significance (McFall-Ngai et al., 2013).These host-microbe associations are diverse taxonomically and functionally, especially in insects: ranging from mutualism, through commensalism, to pathogenicity, they have, in many cases, dramatically influenced the biology and evolution of hosts (Douglas, 2015).In insects, three main symbiont categories are typically distinguished.First, obligate nutritional endosymbionts are strictly heritable, endocellular microbes critical to the survival, function, and reproduction of the insect host as they provide essential amino acids and vitamins lacking in their diet, filling the gaps in the nutritional needs of their host (Baumann, 2005;Moran et al., 2008;Sudakaran et al., 2017).Due to their indispensable roles and strict maternal transmission, nutritional endosymbionts are expected to occur in all individuals of a species (Baumann, 2005;Moran et al., 2008).The second functional category, facultative endosymbionts, is more variable in their distribution across species, populations, and host tissues, as well as in their transmission mechanisms.Although not critical for survival, they may provide important benefits that are often related to the defense or manipulation of host reproduction (Sudakaran et al., 2017).Facultative symbionts are typically transmitted maternally (vertically) but can occasionally transmit across host lines or species (horizontally).Third, diverse microorganisms frequently colonize the host gut or body surfaces.Some of these microbes, inducing a wide range of effects, have co-diversified with hosts for a long time, but others form less stable associations (Sudakaran et al., 2017).Finally, pathogens-whether attacking a host species or vectored by it-form an important fourth category of microbes that may not conform to the definition of symbiosis as a long-term interaction among organisms but can play important roles in host biology.
While these ancient symbiotic relationships can be stable, symbiont replacement and complementation by other symbionts that provide nutrients have occurred several times throughout evolutionary history in most Auchenorrhyncha groups.For example, in many lineages of cicadas, Hodgkinia has been replaced by Ophiocordyceps fungi (Matsuura et al., 2018).In only rare cases, hosts seem to have lost their symbionts entirely, such as in the largely mesophyll-feeding leafhopper subfamily Typhlocybinae (Moran et al., 2005).At the same time, infections with facultative symbionts have often been reported, such as Arsenophonus (Michalik et al., 2023) whose function is generally unknown despite instances of nutritional provisioning (Nov akov a et al., 2009), or Wolbachia (Qu et al., 2013) which has been estimated to occur in about 50% of all insect species (Kaur et al., 2021).Rickettsia is another widespread symbiont in many insect groups (Pilgrim et al., 2021).Despite their common existence, the functions of facultative symbionts are rarely well understood.These secondary symbionts may manipulate reproduction or offer protection against biotic and abiotic challenges, including pathogens, parasites, toxic chemicals, or heat (Oliver et al., 2010), Indeed, in species such as the pea aphid, extensive fluctuation in the prevalence of facultative endosymbiont species across populations and seasons was observed, driven to at least some extent by their defensive effects (Smith et al., 2015(Smith et al., , 2021)).However, in Auchenorrhyncha, little is known about the composition and function of gut microbiota (Michalik et al., 2023), despite their importance for many other insects (Engel & Moran, 2013).
Auchenorrhyncha are abundant in many ecosystems and serve as food for many other organisms such as birds and spiders, are hosts to parasitoid wasps and may cause direct damage to plants through feeding or oviposition (Dietrich, 2009).However, more importantly, they vector plant pathogens, such as the agriculturally important genus Candidatus Phytoplasma (Mollicutes) (hereafter Phytoplasma).These phloem-limited plant parasitic bacteria can cause extensive damage to more than 700 plant species, including maize (Ramos et al., 2020), carrot (Clements et al., 2021), grapevine (Moussa et al., 2023), and many other important crops.Therefore, managing these vector species is of great economic importance even though economic loss estimates are lacking.More than 75% of the known Phytoplasma vectors belong to the leafhopper subfamily Deltocephalinae (Weintraub et al., 2019).This group includes the leafhopper genus Macrosteles, a widespread and morphologically elusive genus with around 90 species described to date, several of which are known to vector phytoplasmas (Frost et al., 2011;Weintraub et al., 2019).
The association of Macrosteles spp. with ancient nutritional endosymbionts Sulcia and Nasuia has been comprehensively studied, with M. quadrilineatus as one of the model systems for the study of the mechanisms of these interactions (Bennett et al., 2016;Mao & Bennett, 2020).Symbiotic associations were also surveyed in M. striifrons and M. sexnotatus from Japan where Sulcia and Nasuia were reported alongside facultative symbionts Rickettsia and Wolbachia, and putative gut bacteria Burkholderia, Pantoea, and others (Ishii et al., 2013).Further, M. laevis, characterized using microscopy and sequencing-based techniques, in addition to Sulcia and Nasuia was shown to harbour the symbiont Arsenophonus in a particularly unusual location: within the cytoplasm of Sulcia cells (Kobiałka et al., 2016).But while we know the identity of some of Macrosteles symbionts, little is known about the diversity of their microbiota and its spatio-temporal variation across species and populations.
Given the lack of such research combined with the economic incentive of surveying a vector species, the goal of this study was to comprehensively survey and characterize spatio-temporal patterns of microbial diversity and distribution in Polish and European populations of M. laevis and a few other species in the genus Macrosteles.We aimed, first, to test the utility of COI amplicon sequencing and barcoding (short standardized DNA sequences used to identify species [Chac & Thinh, 2023]) as a means of delimiting host species, verifying their identity, and detecting parasites or parasitoids, as potential explanatory variables of microbiome composition.Second, we wanted to explore the genetic diversity among the host species and populations, asking whether the intraspecific genetic structure could explain the symbiont diversity patterns.Third, we aimed to characterize the diversity and distribution of Macrosteles symbionts and the variation in M. laevis microbiota across populations and over time.Finally, we wanted to investigate potential patterns of Phytoplasma infection and its correlation with the presence of other microbes.We addressed these goals by sequencing insect and bacterial marker gene amplicons for the collection of over 300 specimens from different populations.

Specimen acquisition and identification
The majority of specimens were collected using sweep netting in Poland between 2015 and 2020, and additional samples were collected or obtained from the United Kingdom, United States and Sweden (Supplementary Tables S1 and S2).Specimens, identified to species level based on morphological features, were stored in 95% ethanol in a freezer until processing.Species identifications included M. laevis, M. sexnotatus, M. cristatus, M. maculosus, M. viridigreseus and M. quadrilineatus, but M. laevis was by far the most abundant.In total, 371 specimens were processed, and for 307 we obtained reliable data and included them in this study (as explained below).A substantial portion of the samples included 158 individuals of M. laevis, collected from two distinct Polish agroecosystems: one in Szczecinek and the other in So snicowice, spanning the years 2015-2018, with 40 insects collected each year.

DNA preparation and extraction
The DNA of 157 insects collected in Szczecinek and So snicowice before 2020 was extracted using the NucleoSpin Tissue kit (Mecherey-Nagel).Whole specimens were placed in tubes, filled with a lysis buffer provided by the manufacturer, and homogenized using a mini pestle.For further steps, the manufacturer protocol was followed.For amplicon sequencing, we selected samples that tested positive and negative for Phytoplasma infections in the earlier nested PCR screen, which followed the standard method for Phytoplasma detection as previously described (Zwoli nska & Borodynko-Filas, 2021).Our goal was to include 20 samples from each of the two categoriespositive and negative-for each site-date combination, although there were instances where fewer samples were available.
A total of 21 samples from Poland, Kampus UJ, were processed using three different extraction approaches: DNeasy PowerSoil (Qiagen) and DNeasy Blood and Tissue kit (Qiagen), for which the manufacturer protocols were followed, and a custom protocol, described below.For each of the three kits, seven specimens were processed, four as whole specimens and with only the abdomens used for the remaining three.As we found no consistent differences among the protocols, the remaining 193 samples were processed using whole specimens and our custom protocol.
For our custom protocol, pecimens were each placed in 2 mL screw-cap tubes together with a lysis buffer (GitHub repository: https://github.com/SymSandra/Macrosteles-project/), proteinase K and ceramic beads.Samples were then homogenized in the homogenizer for 30 s at 5 m/s two times, and incubated in a thermal block at 56 degrees Celsius for 2 h.20% of the homogenate for each sample was used for DNA purification using SPRI magnetic beads (Seramag, SpeedBeads magnetic carboxylate modified particles, Cytiva).The remaining product was stored in a freezer.
To track possible contamination in samples, negative controls were used in the form of water for both extraction and subsequent steps of PCR and indexing, amounting to 24 control samples in total for all 371 samples processed.

Amplicon library preparation and sequencing
During the first step, we targeted two regions-insect COI, using primers BF3 and BR2 (Elbrecht & Leese, 2017) and hypervariable regions V1-V2 and V4 of bacterial 16S rRNA gene, using primers 27F and 338R (Walker et al., 2015) and 515F and 806R (Apprill et al., 2015;Parada et al., 2016), respectively.These primer pairs, with partial Illumina adapters at the 5 0 end, were combined in a single multiplex reaction with Qiagen Multiplex DNA polymerase.Then, after purification using SPRI magnetic beads, PCR products were used as templates for the second indexing PCR, with a different set of uniquely labeled oligonucleotides used as primers for each sample.Subsequently, PCR products were verified and roughly quantified using a 2.5% agarose gel stained with Midori Green.Libraries were then pooled based on gel band intensity, cleaned with Promega magnetic beads, and sequenced in a Illumina MiSeq V3 lane (2 Â 300 bp reads) at the Institute of Environmental Sciences of Jagiellonian University.All protocols are provided and described in the following GitHub repository: https://github.com/SymSandra/Macrosteles-project/.

Amplicon data analysis
Amplicon data were processed following a pipeline detailed at a dedicated GitHub page (https://github.com/SymSandra/Macrosteles-project/), closely resembling that described in recent publications (Kolasa et al., 2023;Valdivia et al., 2023).Briefly, reads were divided into bins corresponding to the targeted marker regions based on primer sequences.Then, each bin was analyzed separately using custom protocols combining vsearch, usearch, and custom steps.Forward and reverse reads were assembled into contigs and quality-filtered.Contigs were dereplicated, and representative sequences were chosen, with singleton sequences removed from further analysis.Samples were denoised with usearch and aligned against a customized reference database based on MIDORI (Leray et al., 2018) for COI and SILVA 138 (Quast et al., 2012) for 16S rRNA.For analysis of V1-V2 and V4 16S rRNA gene sequences, we used UCHIME (Edgar et al., 2011) to screen and remove chimeric sequences prior to taxonomy assignment.Finally, the sequences were clustered at 97% identity level, and Operational Taxonomic Unit (OTU) and zOTU (zero-radius OTU) tables were produced.We chose the most abundant COI sequence variant classified as Macrosteles as the barcode for each sample.
Finally, we used a decontamination procedure for 16S rRNA samples, using as references multiple negative control samples for different steps of library preparation (DNA extraction, first PCR, and a second PCR with indexing).The custom Python script recognizes and filters out OTUs assigned based on the comparison of the relative abundance of genotypes among negative controls and samples, as well as removing OTUs assigned as mitochondria, chloroplasts, Eukaryotes, or Archaea (for more details, see the GitHub repository).Resulting data tables were manually filtered so that only the zOTUs represented by at least 100 reads and representing at least 5% in at least one sample, were specifically considered in distribution analyses, with the remainder classified as 'Others'.Samples with less than 100 COI reads representing the barcode (that is, the dominant Macrosteles genotype), or with less than 1000 16S-V4 reads, were excluded from all comparisons.

Amplicon sequencing data overview
Out of 371 insects initially processed for this study, 65 were excluded due to an insufficient number of reads for either the COI region (≧100), preventing molecular identification, or for the V4 region of the 16S rRNA (≧1000).Most of the discarded specimens represented the Swedish collection, stored under suboptimal conditions which led to greater DNA degradation.
Consequently, 306 libraries were used for core analyses combining COI and V4 data.
The amount of data for the V1-V2 region of the 16S rRNA gene was generally less than for V4, and 77 additional libraries were further excluded from the comparison among these two gene regions.V1-V2 data were not used for other comparisons.

COI-based host species delimitation and genotype diversity
Across 306 samples, we identified 215 COI genotypes (zOTUs), clustering into 43 OTUs.44 of these genotypes, representing a large majority of reads (90.8%) were clustered into 7 OTUs which were taxonomically annotated as one of seven Mactosteles spp.(Figure 1A) and considered as representing mitochondrial COI sequences, or barcodes (Supplementary Tables S3 and S4).We identified another group of OTUs also classified as Macrosteles, but always accompanying the 'barcode' OTUs and not exceeding 10% of the total number of reads in the library, but generally closer to 1%.We assume that these secondary OTUs represent nuclear pseudogenes (numts), or other biologically inactive sequences.However, a curious case is OTU10, comprising between 0 and 6% of reads in all M. laevis individuals except for one, SWEAM3, where it makes up 75%, dominating over the 'barcode' OTU1.We are not sure whether this represents a true biological phenomenon, or a laboratory artefact, for example, related to the patterns of DNA degradation.
Except for 10 specimens, COI sequence-based taxonomic annotations matched morphology-based IDs, but in the remaining cases, the barcode sequences matched different Macrosteles species than expected.Specifically, five out of six specimens previously identified as M. sexnotatus and one specimen of M. cristatus had barcodes matching M. laevis.One specimen identified as M. laevis belonged to the species M. cristatus.One specimen of M. cristatus matched the barcode of M. viridigreseus.Lastly, two specimens morphologically identified as M. laevis had the closest match to M. frontalis after a Blast search against N. Given the known challenges with morphology-based identifications of this elusive genus, the results of barcoding were used for further analysis.
The analysis of intra-OTU COI genotype diversity provided interesting insights into Macrosteles population structure.In M. laevis, one genotype-a consensus sequence for all genotypes in the OTU-dominated the profiles of most specimens from all locations (Figure 1B-D).However, in 43 out of 270 (15.9%)M. laevis individuals from across populations and sampling years, we identified other genotypes that contributed >5% of the total OTU and usually co-occurred with the consensus one.To identify sequence variants that are biologically realistic, rather than representing sequencing artefacts, for the characterization of genotype-level patterns, we only considered genotypes represented by ≧100 reads and with a relative abundance threshold of ≧5% of at least one library.For the most abundant species, M. laevis, where the barcode was represented by OTU1, 27 out of 60 zOTUs in that OTU exceeded this relative abundance threshold (Figure 1D).Almost all of these genotypes differed at only a single nucleotide position from the dominant one, but their distribution showed no consistent geographic or temporal patterns (Figure 1D).We interpret these patterns as heteroplasmy-the presence of more than one mitochondrial genotype within an individual (Avise, 2000), likely resulting from recent point mutations that have spread but not yet reached fixation within individuals or populations.Likewise, we found more than one COI genotype in M. viridigreseus, with three equally abundant genotypes in the British population.The single Polish specimen of M. viridigreseus had a distinct genotype from those occurring in the British population.In M. quadrilineatus, two genotypes were found, with one genotype clearly dominant across the samples but with two specimens displaying a second genotype.
COI amplicon data inform about parasitoid, Wolbachia, and Rickettsia infections Interestingly, a significant share of reads in our COI dataset did not match leafhoppers, instead providing insights into their interactions with other organisms.Two OTUs matched NCBI database records of Eudorylas fuscipes and Pipunculus omissinervis (both Diptera: Pipunculidae) with 100% and 98.8% identity, respectively.The first (OTU2, E. fuscipes) was represented by four genotypes distributed across 11 individuals, including 2 where it was represented by >100 reads.The second (OTU16, P. omissinervis) occurred in a single specimen only.Species of the family Pipunculidae are known parasitoids of Auchenorrhyncha and E. fuscipes is known to parasitize M. laevis (Ba nkowska, 1989).Both parasitoids occurred only in Polish specimens of M. laevis over several years.
Another set of COI OTUs matched Rickettsiales bacteria Wolbachia and Rickettsia, known as facultative endosymbionts in a wide variety of insects.Wolbachia OTU5 was present in 2 specimens of M. viridigreseus, one specimen of M. laevis, and 1 specimen of M. maculosus, and OTU22-in two specimens of M. laevis.Rickettsia was represented by two COI OTUs, each in a distinct M. laevis individual.

Microbiota diversity comparison based on two 16S rRNA regions
Across 229 libraries with at least 1000 reads for both V4 and V1-V2 regions of the 16S rRNA gene (Tables S5-S8), we compared the presence of more abundant clades, focusing on OTUs that exceeded 1% relative abundance in at least one library.For the V4 region, we identified 23 such OTUs, together comprising 99.88% of reads in a library on average (Table S10).For the V1-V2 region, we identified 29 such OTUs, comprising 99.83% of reads in a library on average (Supplementary Table S9).
Both datasets were dominated by the OTUs of heritable endosymbionts.Sulcia OTU and Arsenophonus OTUs were abundant in both datasets, although in the latter case, differential clustering of genotypes into OTUs was particularly evident (Figure 2).On the other hand, Nasuia was well-represented in only the V4 dataset-identified in every library and comprising 30.92% of reads on average, whereas in the V1-V2 dataset, it was only found in 51 libraries, at low abundance.Facultative endosymbionts Cardinium, Rickettsia, Wolbachia and unclassified Gammaproteobacterium were better represented in the V1-V2 dataset than in the V4 dataset, as was Phytoplasma (Figure 2).However, the six-fold greater V4 sequencing depth (on average) may explain why Phytoplasma was detected in fewer V1-V2 than V4 libraries (24 vs. 48).
The lack of overlap among the 16S rRNA regions has complicated direct OTU-to-OTU comparisons among less abundant bacteria, and we based further comparisons primarily at genus-level classification (Table S9), in some cases also using full-length 16S rRNA sequences from Genbank to establish equivalence.We blasted the dominant OTUs from V4 against the full-length database, extracted the sequences of top hits, and then used them as a database for blasting V1-V2 zOTUs.We observed that all genera represented by OTUs that exceeded 1% in at least one V4 library were represented among the more abundant OTUs in the V1-V2 dataset.Similarly, the bacterial genera that exceeded 1% relative abundance in at least one V1-V2 library had equivalents in the V4 dataset.
We conclude that except Nasuia, the V1-V2 and V4 regions of the 16S rRNA gene identify overlapping bacterial clades and reconstruct similar distributions.Many less abundant genera have higher representation in the V1-V2 dataset, which could have facilitated their detection in the case of even sequencing coverage.However, because of the much lower overall read numbers for the V1-V2 region, we have decided to rely on V4 region data only for the subsequent analyses.
To show how hosts' mitochondrial diversity and sampling location influence microbiome variation, we conducted dbRDA (Suppl.Figure 1) with microbiome composition (Jaccard presence/absence, Suppl.Figure 1A, and Bray-Curtis abundance index, Suppl.Figure 1B) as dependent and dominant OTU and location as explanatory variables.Additionally, we performed an analysis of variance (ANOVA) to show how factors included in dbRDA explain the response variable (Suppl.Table S11).dbRDA showed that host mitochondrial diversity and sampling location explain 22.54% (Jaccard) or 19.89% (Bray-Curtis) of variance in microbiome composition among studied specimens.

The endosymbiont diversity across Macrosteles species
The microbial communities of the surveyed Macrosteles leafhoppers were dominated by the previously reported heritable endosymbionts (Figure 3A, B).All specimens hosted the obligate nutritional symbionts Sulcia and Nasuia.These two bacteria dominated the microbiota in the vast majority of specimens, with Sulcia making up 45.3% of the microbial community on average (range 4.12%-84%), and Nasuia 26.6% on average (range 1.95%-80%).All characterized Macrosteles leafhoppers, regardless of the species, hosted the same Sulcia genotype.Nasuia was represented by ten genotypes grouped into two OTUs, with each species hosting a distinct genotype or a set of genotypes.Some M. laevis specimens harbored a distinct genotype from the dominant one for the species, with a single (zOTU038) or two-nucleotide difference (zOTU33 and zOTU41).
We also identified OTUs matching known facultative endosymbionts of insects in the majority of studied individuals.Most notably, OTUs identified as Arsenophonus occurred in 222 of 270 (82.2%)M. laevis specimens, with relative abundance within libraries ranging from 0.004% to 68.8%.In M. laevis, we detected four OTUs classified to the genus Arsenophonus that were consistently present in specimens.Because of the consistent abundance of these OTUs  prevalence within populations ranged from 0% to 100%, with no consistent differences across sites or sampling dates.The same pattern of several cooccurring OTUs was present in M. cristatus and M. sexnotatus.However, in other Macrosteles species, Arsenophonus was generally represented by different OTU combinations and made up less than 10% of reads on average.
Facultative endosymbiont Wolbachia, represented by two OTUs, was relatively rare and found only in a small proportion of specimens.The OTU7 of Wolbachia was present in two specimens of M. viridigreseus, one of M. laevis, one of M. maculosus, and one of M. cristatus, from the Polish population.Notably, in four out of these five specimens, Wolbachia was also detected in COI amplicon data.Wolbachia OTU73 was present in 9 M. laevis specimens, and its relative abundance did not exceed 0.77%.Notably, five of these specimens were among the 12 that contained parasitoid Eudorylas fuscipes COI reads.It seems plausible that the symbiont infected the parasitoid rather than the leafhopper, but the low relative amounts of parasitoid tissue hindered their reliable co-detection.
In the V4 dataset, we also found Candidatus Trichorickettsia in two M. laevis specimens, each from a different Polish population.Further, most individuals of M. quadrilineatus were infected by a bacterial OTU17, likely a Gammaproteobacterium, but without close similarity to any named bacteria in the NCBI database.
We also discovered two putative facultative endosymbionts not previously reported in Macrosteles leafhoppers.First, we found Cardinium in all M. quadrilineatus and in seven out of eight M. cristatus from Poland and one from Sweden.It was also found in two of the British M. viridigreseus specimens.The one specimen of M. viridigreseus missing the symbiont Cardinium was infected by Wolbachia.Finally, we identified an Alphaproteobacterium classified as Lariskella in three specimens of M. maculosus with a relative abundance ranging from 0.9% to 1.17%.Combined, the seven symbionts listed above made up 97.2% of reads in a library on average, and together with Phytoplasma described in the next section, 98.2%.
The other microbes were less common and abundant in the dataset.They included Enterobacterales genus Pantoea, Rhizobiales genus Methylobacterium, Pseudomonadales genera Acinetobacter and Pseudomonas, Burkholderiales genera Massilia, Acidovorax, Ralstonia, Janthinobacterium, Dechloromonas, and Robbsia, Sphingobacterales genus Pedobacter, and others.Of these, Pantoea OTU019 was relatively most widespread and abundant, infecting 25.8% of all specimens, including 27% of specimens in M. laevis and 10.5% in M. quadrilineatus, 37.5% in M. cristatus and 33.3% in M. maculosus.It made up an average of 0.2% of the microbial community (max.14.05%).Other OTUs were less prevalent and abundant, and none of them showed consistent associations or distribution patterns across species and populations that we would expect in cases of a specific and biologically significant association (Supplementary Tables S5 and S6).

Surveying Phytoplasma prevalence across Macrosteles individuals
The plant-pathogenic bacterium Phytoplasma was detected in 57 individuals, but was generally found at relatively low abundance: in one individual, it was 44.8% and in a few others $10%, but typically, it was well below 1% (Figure 4B, Supplementary Tables S5 and S6).This large variation among individual insects in Phytoplasma relative abundance may explain the limited overlap between its detection based on amplicons as opposed to diagnostic nested PCR (Zwoli nska & Borodynko-Filas, 2021).When looking at the 156 M. laevis individuals from Szczecinek and So snicowice (Poland) that were pre-selected for microbiome characterization based on the results of Phytoplasma screen, we found a discrepancy between Phytoplasma detection based on nested PCR and amplicon data (Figure 4A).Out of 59 specimens selected as Phytoplasma-positive based on nested PCR, in only 24 we found 16S rRNA reads representing this bacterium.At the same time, out of 97 specimens selected as Phytoplasma-negative based on nested PCR, we found 16S rRNA reads matching this microbe in 18 samples.While Phytoplasma was more likely to be detected in amplicon data for nested PCR-positive specimens, 16S rRNA amplicon sequencing on its own appears insufficient as a means of detecting Phytoplasma presence.Outside of these pre-screened populations, the remaining occurrences of Phytoplasma were found in 11 M. laevis from other populations, single specimens of M. frontalis and M. sexnotatus, and two specimens of M. viridigreseus.
At the genotype level, Phytoplasma was represented by 13 zOTUs (Supplementary Table S6).The dominant genotype (zOTU14) was found in 35 M. laevis specimens from Szczecinek-So snicowice collections and was also detected in three additional Polish populations.Two other genotypes, zOTU91 and zOTU100, co-occurred together in all six specimens from four Polish populations of M. laevis at a near 50/50 ratio, suggestive of them representing distinct rRNA sequence variants within a single genome.
Further genotypes, zOTU230 and zOTU423, only occurred in specimens from Szczecinek sampled in 2017, and six of seven specimens were co-infected with the dominant genotype (zOTU14).Additionally, zOTU451 only occurred in two specimens from So snicowice from 2016.The remaining lowerabundance genotypes occurred in Szczecinek-So snicowice collection in a single specimen.Six of the low abundance genotypes occurred in only one specimen.Besides M. laevis, Phytoplasma (zOTU14) only occurred in one specimen of M. sexnotatus, one specimen of M. frontalis, and two specimens of M. viridigreseus in which one had an additional genotype (zOTU91).The genotype zOTU137 was only present in two specimens from Opawskie Mountains.
We also asked whether the Phytoplasma abundance/presence correlates with the presence of other microbes in the Szczecinek-So snicowice collection.We looked specifically at the correlation with Arsenophonus as other microbes were too uncommon.We did not find a correlation between Arsenophonus presence or absence and Phytoplasma presence or absence across samples of M. laevis from So snicowice and Szczecinek (Figure 4C) (Chi-square test: χ 2 = 0.53, df = 1, p < 0.46 for all Phytoplasma reads and χ 2 = 0.04, df = 1, p < 0.83) for all Phytoplasma with relative abundance of at least 5%.

DISCUSSION
Our microbiome surveys across seven species of Macrosteles leafhoppers revealed the conserved nature of their ancient nutritional symbioses with Sulcia and Nasuia, and substantial variation among species in the presence of other symbiotic bacteria (Ishii et al., 2013).At the same time, the study across 270 individuals of Macrosteles laevis, representing several populations and sampling points in Poland and beyond, showed no discernable patterns in the prevalence of other symbionts among or within populations.Although both hosts' genetic diversity and sampling location explained a significant proportion of microbiome variance (Suppl.Figure 1A,B, Suppl.Table S2) we didn't observe any grouping patterns.In both cases, the simultaneous characterization of host and symbiont marker gene data was instrumental in describing and characterizing the patterns.Despite the growing popularity of insect COI gene sequencing, or barcoding, as a part of biodiversity surveying efforts (Srivathsan et al., 2023), this approach has rarely been incorporated into the study of host-associated microbiota.Here, COI amplicon data provided several pieces of information needed to interpret microbiome data.
First, morphology-based identification of wild-caught insects can be challenging, but the use of COI data can clarify host identity.In our dataset, taxonomically annotated COI data did reveal some misidentifications, supporting the view of the genus Macrosteles as an elusive group with a high level of morphological identification difficulty.Most identification is based on males since females often lack distinguishing characteristics (Holzinger et al., 2003).However, molecular data proved useful in clearing up taxonomic mistakes and ambiguities.
Second, the composition of the microbial community, and especially heritable symbioses, is known to often correlate with the host phylogeny.COI amplicon data can provide us with at least some insight into intraspecies genetic diversity and population history.In a recent broad geographic survey across populations of the spittlebug Philaenus spumarius, COI barcodes highlighted the divergence among major clades and revealed smaller-scale genetic variation among populations, which often correlated with symbiont associations (Kolasa et al., 2023).In M. laevis, no comparable patterns were found.Instead, the identification of the single dominant genotype accompanied by a cloud of genotypes 1-2 bases apart, with no consistent trends or differences among populations, suggested their relatively recent shared ancestry and ongoing diversification.The discovery of substantial levels of heteroplasmy, generally including the dominant COI variant plus one of the variants one base apart, further supports that interpretation.In M. laevis, the discovery of that intra-species COI diversity did not help to explain the patterns of microbial associations.However, it would be fascinating to explore its effects in other insect clades where such diversity and heteroplasmy were reported, including neotropical ants (Meza-L azaro et al., 2018), thrips (Frey & Frey, 2004), the eastern honeybee (Apis cerana) (Songram et al., 2006) and periodical cicadas (Fontaine et al., 2007) among others.
Third, COI data for individual insects can provide valuable information about parasitoid or parasite infections.These natural enemies, known to strongly influence insect population dynamics (Abram et al., 2019), are likely to alter microbiome profiles-whether by selecting hosts with specific microbiota, shifting the microbial community of the attacked host, or contributing parasites' microbes to the community profile (Dicke et al., 2020).Indeed, in P. spumarius, one of Wolbachia OTUs only occurred in specimens infected by its specialized parasitoid, suggesting the parasitoid is the source of Wolbachia (Kolasa et al., 2023).Likewise, known Auchenorrhyncha parasitoids Pipunculus omissinervis and Eudorylas fuscipes were detected in some Macrosteles specimens.Several of those carrying E. fuscipes also contained a Wolbachia OTU rare in other samples, suggesting the parasitoid as its source.The imperfect correlation between the reconstruction of parasitoids and symbiont presence may be highlighting the limits of their amplicon-based detection.However, COI data can provide additional information on the presence of alphaproteobacterial symbionts Wolbachia and Rickettsia, confirming their 16S rRNA-based detection and improving phylogenetic resolution.In our dataset, we found a high correlation between the presence of these microbes in COI and 16S rRNA datasets for the same insects, and we are now working on formally comparing their detection capacity across a diverse range of insects.
The comparison of microbial community reconstructions based on two targeted 16S rRNA regions concluded that the V1-V2 region data was redundant, not contributing additional information relative to the V4 region data while failing to capture the signal of Nasuia.Overall, the microbiome data emphasized the stability of ancient nutritional heritable symbioses, previously demonstrated in many clades of Auchenorrhyncha, but not universal across their phylogenetic tree (Bennett & Moran, 2015), and rarely explored extensively within species.The lack of within-species variation in Sulcia and Nasuia 16S rRNA sequences further argues for limited divergence among the sampled populations.On the other hand, the slow rate of Sulcia genome evolution and limited differences among species within the genus have also been found in other Auchenorrhyncha clades (Campbell et al., 2015;Kolasa et al., 2023;Michalik et al., 2023).
In the genus Macrosteles, as in many other hemipteran groups, ancient symbionts are accompanied by accessory symbionts, especially Gammaproteobacteria.In several systems studied to date, they complement ancient symbionts' biosynthetic pathways and contribute additional nutrients, especially B vitamins (McCutcheon et al., 2019).In M. laevis, we found Arsenophonus across all M. laevis populations sampled in Poland, but in neither has it reached fixation, arguing against its essential role despite the very specific and unusual endobacterial localization (Kobiałka et al., 2016).Arsenophonus was also found in most other Macrosteles species except M. quadrilineatusalthough it has been shown to occur in wild populations of that species too (Bennett & Moran, 2013).We were initially surprised by the apparent diversity of Arsenophonus, represented by several OTUs across all sampled M. laevis individuals.However, Arsenophonus is notorious for carrying divergent copies of the 16S rRNA gene across distinct operons within a genome (Nov akov a et al., 2009(Nov akov a et al., , 2016;;Šorfov a et al., 2008), which seems to be the case also here.In P. spumarius, the presence of distinct sets of zOTUs of its gammaproteobacterial symbiont Sodalis in different individuals suggested different or divergent strains across populations (Kolasa et al., 2023); here, most individuals of M. laevis had the same base set of zOTUs with some scattered genotypes for various populations.Unfortunately, so far, we lack information on the phylogenetic relationships among strains infecting different Macrosteles species: they might represent an ancient symbiont co-diverging with hosts, facultative endosymbiont capable of occasional horizontal transmission, result from independent acquisitions, or perhaps a combination of these possibilities.Likewise, so far, we lack genomic data on the nature of this unusually localized microbe.In other species, Arsenophonus has been established both as an obligate long-term endosymbiont or as a more recently acquired facultative endosymbiont, as in louse flies ( Říhov a et al., 2023).In the future, genomics and experiments should be able to clarify both the relationships and potential roles of these common Macrosteles associates.
Other putative facultative endosymbionts-Wolbachia, Rickettsia, Cardinium, and Lariskella-were relatively rare in our collection and patchily distributed across species and populations.Wolbachia, known for the breadth of its distribution across insect diversity, its ability to manipulate host reproduction, and the antiviral and other benefits it can confer (Kaur et al., 2021;Weinert et al., 2015) was found in only a few individuals.Rickettsia and Cardinium are among the most widespread insect facultative endosymbionts, and have a range of effects comparable to Wolbachia (Weinert et al., 2015).We only found Rickettsia in M. laevis and M. viridigreseus but it has been reported previously in Japanese specimens of M. sexnotatus and M. striifrons (Ishii et al., 2013).Cardinium infected three Macrosteles species; while not reported from this genus before, it had been found in several species of planthoppers (Nakamura et al., 2009) and the leafhoppers Scaphoideus titanus (Marzorati et al., 2006) and Euscelidius variegatus (Chrostek et al., 2017).High infection prevalence in M. cristatus, M. quadrilineatus, and M. viridigreseus suggests that at least in these species, it plays a significant role.A particularly interesting case is Lariskella in M. maculosus, to our knowledge the first record of this bacterium in Auchenorrhyncha.This Alphaproteobacterium has been reported from stink bugs, but also other arthropods such as fleas and ticks (Matsuura et al., 2012).Its fitness roles are unknown, but in stinkbugs infected with the Lariskella, no skewed sex ratio was observed, suggestive of their other functional roles, including the potential provision of nutrients (Matsuura et al., 2012).
Microbes impact insects in many ways, altering each other's effects on host life history traits and biology (Cornwallis et al., 2023;Sudakaran et al., 2017).This includes reducing the transmissibility of vector pathogens by other host-associated microbes.For example, in Aedes aegypti mosquitoes, artificial infection with Wolbachia largely blocks the transmission of dengue virus to humans (Edenborough et al., 2021).Due to the economic importance of M. laevis as a vector of plant-pathogenic phytoplasmas, assessing the relationship between the symbiont complement and the presence/abundance of Phytoplasma was among the primary original goals of this study.However, for Arsenophonus, the only microbe sufficiently prevalent to enable testing for Phytoplasma-protective effects, no significant association was found.It would be useful to systematically search for any such protective effects of other symbionts of diverse Auchenorrhyncha.However, our study has also revealed a limitation of the amplicon-sequencing search for infection: the limited detection sensitivity relative to the routine testing technique, nested PCR (Hafez et al., 2005).Phytoplasma abundance within host tissues varies among different stages of infection (Frost et al., 2011), and it can be assumed that at the time it is low, it may contribute only a small share of all 16S rRNA copies in a specimen.Depending on the amplicon sequencing depth, it may well avoid detection.Such screens could also be complicated by cross-contamination among samples that can happen on Illumina platforms during sequencing (Kircher et al., 2012).Likely, changes in the library preparation methods, including the optimization of primers or perhaps combining universal and specific primers and the adjustment of sequencing depth, will make this approach much more reliable.Another approach could be the use of shotgun metagenomics which have been used successfully to unveil complex microbiomes.This method could aid the discovery of new endosymbionts (Quince et al., 2017).

CONCLUSION
We have shown that the simultaneous amplification and sequencing of host and microbial marker genes is a powerful technique for a broad survey of hostassociated microbiota.Through sampling across species, as well as across populations and separate dates for one of the species, we aimed to simultaneously and broadly reconstruct spatio-temporal patterns of host mitochondrial diversity and microbial community composition and use the former data to explain the latter.Such approaches, used repeatedly to characterize patterns in human microbiota (Gupta et al., 2017;Ma et al., 2014), have rarely been used for insects.We know of just two such approaches, one focused on aphid-parasitoid-symbiont networks (Ye et al., 2017) and the other on the spittlebug Philaenus spumarius (Kolasa et al., 2023).However, unlike the latter study, which demonstrated a clear genetic structure among sampled insects and significant differences in microbiota composition among populations, we found little diversity and few clear patterns in M. laevis.There were no consistent differences among geographically distant populations or sampling dates.Compared to P. spumarius, or pea aphids that also have been systematically surveyed for multiple bacterial associates (Pilgrim et al., 2021), M. laevis microbiota are surprisingly stable.Future work will show whether stability is an unusual feature of this one species.On the other hand, the comparison of M. laevis with other Macrosteles species revealed significant differences and a much greater diversity of microbial associates, expanding our understanding of the microbiota of this widely distributed and ecologically significant genus.With more species and populations of Macrosteles and associated microbiota and potential methodological modifications that could enhance the detection success of pathogens such as Phytoplasma, high-throughput multi-target amplicon sequencing has the potential to greatly enhance our understanding of microbiomerelated processes and patterns in natural populations and communities.

F
I G U R E 1 COI diversity in Macrosteles leafhoppers.(A) Maximum likelihood tree for the sampled Macrosteles species based on the dominant zOTU sequence from each respective OTU.Support values are based on 1000 bootstraps.(B) Sample locations for all specimens.Per-site sampling effort shown for Macrosteles laevis, with green circle size representing the number of specimens collected across all seasons.Hexagon colours correspond to species as in panel A. (C) Photo of M. laevis by Tomasz Klejdysz.(D) COI genotype diversity of M. laevis (OTU1) as the relative abundance of zOTUs in percentage based on the cutoffs of >100 reads and making up at least 5% of the library.Each bar corresponds to one specimen.Size-dependent circles correspond to sampling efforts across different years for sites with multiple sampling seasons.

F
I G U R E 2 The comparison of averaged relative abundances of dominant bacterial genera among amplicon datasets for 16S rRNA V4 and V1-V2 regions for the same 229 Macrosteles samples.relative to each other and the high degree of sequence divergence among rRNA operons within the genomes of previously characterized strains (Nov akov a et al., 2016), we concluded that the different OTUs present in the same specimens almost certainly represent a single microbial strain.The Arsenophonus

F
I G U R E 3 (A) Relative abundance of the dominant bacterial 97% OTUs across Polish populations of M. laevis, based on amplicon data for the V4 region of 16S rRNA Rows correspond to specimens, and columns-to bacterial OTUs.Column 'Total other' contains the sum of all other OTUs present in each specimen.White squares equal 0% relative abundance.For all sampled populations, the circle size represents the sampling depth for the population, and the black area of the circle-the proportions of individuals infected with the particular symbiont.(B) Symbiont relative abundance across populations of other Macrosteles species.Each row corresponds to one bacterial 16S rRNA OTU and each column corresponds to one specimen.Row 'Total other' contains the sum of all other OTUs present in each specimen.OTUs marked with an asterisk signify OTUs that did not pass the cutoff criteria but were selected to highlight distribution.The full line corresponds to females and the dotted line to males.

F
I G U R E 4 (A) The comparison in Phytoplasma detection efficiency among M. laevis individuals from Szczecinek and So snicowice that had previously been scored as Phytoplasma-positive or negative based on nested PCR.(B) Relative abundance of Phytoplasma in microbial communities of 42 pre-selected infected specimens from Szczecinek and So snicowice, based on amplicon sequencing.(C) Relative abundance of Phytoplasma in microbial communities of individuals where Arsenophonus was present (blue) versus absent (red).